
Dynamic texture and scene classification by transferring deep image features



Abstract

Dynamic texture and scene classification are two fundamental problems in understanding natural video content. Extracting robust and effective features is a crucial step towards solving these problems. However, the existing approaches suffer from sensitivity to varying illumination, viewpoint changes, or even camera motion, and/or from a lack of spatial information. Inspired by the success of deep structures in image classification, we attempt to leverage a deep structure to extract features for dynamic texture and scene classification. To tackle the challenges in training a deep structure, we propose to transfer some prior knowledge from the image domain to the video domain. Specifically, we apply a well-trained Convolutional Neural Network (ConvNet) as a mid-level feature extractor to extract features from each frame, and then form a representation of a video by concatenating the first- and second-order statistics over the mid-level features. We term this two-level feature extraction scheme a Transferred ConvNet Feature (TCoF). Moreover, we explore two different implementations of the TCoF scheme, the spatial TCoF and the temporal TCoF, in which the mean-removed frames and the differences between two adjacent frames, respectively, are used as the inputs of the ConvNet. We systematically evaluate the proposed spatial TCoF and temporal TCoF schemes on three benchmark data sets, DynTex, YUPENN, and Maryland, and demonstrate that the proposed approach yields superior performance.
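The abstract describes a concrete two-level pipeline: per-frame mid-level ConvNet features, followed by concatenated first- and second-order statistics pooled over the frames. Below is a minimal sketch of that pipeline. It assumes an ImageNet-pretrained AlexNet from torchvision as the stand-in ConvNet, the fc7 activation as the mid-level feature, and the per-dimension variance as the second-order statistic; none of these specifics are stated in the abstract, so all three are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torchvision.models as models

# Stand-in for the "well-trained ConvNet" (an assumption: the abstract does
# not name the network). eval() disables dropout in the classifier layers.
backbone = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
backbone.eval()

def frame_features(frames: torch.Tensor) -> torch.Tensor:
    """Mid-level ConvNet features for a stack of frames.

    frames: (T, 3, 224, 224) float tensor.
    returns: (T, 4096) activations of a mid-level fully connected layer.
    """
    with torch.no_grad():
        x = backbone.features(frames)    # convolutional feature maps
        x = backbone.avgpool(x)
        x = torch.flatten(x, 1)
        x = backbone.classifier[:5](x)   # stop at fc7, a mid-level layer
    return x

def tcof(frames: torch.Tensor, temporal: bool = False) -> torch.Tensor:
    """Video-level TCoF descriptor.

    spatial TCoF:  mean-removed frames are fed to the ConvNet;
    temporal TCoF: differences of adjacent frames are fed instead.
    """
    if temporal:
        # Difference between two adjacent frames as ConvNet input.
        frames = frames[1:] - frames[:-1]
    else:
        # Remove the mean frame (read here as the per-video temporal mean,
        # an assumption about what "mean-removed" refers to).
        frames = frames - frames.mean(dim=0, keepdim=True)
    feats = frame_features(frames)                    # (T, D) mid-level features
    first_order = feats.mean(dim=0)                   # first-order statistics
    second_order = feats.var(dim=0, unbiased=False)   # second-order statistics
    return torch.cat([first_order, second_order])     # concatenated video descriptor
```

The resulting 8192-dimensional vector would then go to an off-the-shelf classifier such as a linear SVM; the abstract does not specify the classifier, so that choice is likewise an assumption.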
